
add models #42


Merged: 6 commits merged on Jul 21, 2025

Conversation

BoyuanFeng (Contributor) commented on Jun 22, 2025:

This PR adds more benchmarks to CI. After this PR, we cover the following settings:

[Image: table of benchmark settings covered after this PR]

BoyuanFeng changed the title from "add gemma-3-27b-it and qwen3_30B-A3B" to "add models" on Jul 20, 2025.
Diff excerpt from the benchmark test configuration:

```
@@ -99,7 +183,112 @@
      }
    },
    {
      "test_name": "serving_llama4_maverick_fp8_tp8",
      "test_name": "serving_llama4_scout_tp4_random_in200_out200",
```
A contributor commented on this hunk:
Note: we need a better way to separate these cases with different input/output shapes on the dashboards. At the moment, only tensor parallel size and request rate are shown: https://hud.pytorch.org/benchmark/llms?repoName=vllm-project%2Fvllm

cc @yangw-dev if you have time to pick this up

@huydhn (Contributor) left a comment:

Thank you for adding these models!

BoyuanFeng merged commit 60beea6 into main on Jul 21, 2025.
42 of 48 checks passed